This document provides a summary of various data sources used in our analysis, along with references to the R code that handles data reading and cleaning.
We are examining daily climate data from temperature stations across Canada, focusing on five key stations that are most representative of the entire province of British Columbia.
The dataset includes the following columns:
x: Longitude
y: Latitude
LOCAL_DATE: Date in the format year-month-day
TOTAL_PRECIPITATION: Total precipitation in mm
STATION_NAME: Name of the station
MEAN_TEMPERATURE: Mean of daily maximum and minimum temperatures
MAX_TEMPERATURE: Daily maximum temperature
MIN_TEMPERATURE: Daily minimum temperature
TOTAL_RAIN: Total rainfall
MIN_REL_HUMIDITY: Minimum relative humidity
LOCAL_YEAR: Year
LOCAL_MONTH: Month
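As a minimal sketch of how a file with these columns could be loaded in R (this is not the project's own reader), the example below parses LOCAL_DATE into a Date and derives the year and month from it. The two rows are made-up illustrative values, not real observations, and only a subset of the columns is shown.

```r
# Hypothetical two-row extract; values are illustrative, not real observations.
csv_text <- "x,y,LOCAL_DATE,STATION_NAME,MAX_TEMPERATURE,MIN_TEMPERATURE,MEAN_TEMPERATURE
-123.18,49.19,2021-06-27,VANCOUVER INTL A,31.0,17.0,24.0
-123.18,49.19,2021-06-28,VANCOUVER INTL A,32.0,18.0,25.0"

# For a real download, replace `text = csv_text` with the path to the CSV file.
station <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# LOCAL_DATE is documented as year-month-day, so parse it explicitly.
station$LOCAL_DATE <- as.Date(station$LOCAL_DATE, format = "%Y-%m-%d")

# Derive the year and month columns from the parsed date.
station$LOCAL_YEAR <- as.integer(format(station$LOCAL_DATE, "%Y"))
station$LOCAL_MONTH <- as.integer(format(station$LOCAL_DATE, "%m"))

str(station)
```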
Our goal is to identify stations with data extending back to 1941 in order to compare the heatwave of 1941 with that of 2021.
Link to data source: Daily Climate Station Data
Through the provided link, you can utilize the map tool to locate your desired station by either scrolling through the map or entering the station name or ID in the search bar. Please ensure you download the data in CSV format, rather than GeoJSON. (Specific station information, including location and ID, can be found in section 1.3.)
If the link does not work, the data can be accessed through the official website of the Government of Canada by following the navigation path outlined below:
Home > Environment and natural resources > Climate change > Climate change: our plan > Adapting to Climate Change > Canadian Centre for Climate Services > Display and Download Climate Data > Climate data extraction tool > Daily climate data (the link above)
| Station Name | Start Year | End Year | Station ID | Start Date | End Date |
|---|---|---|---|---|---|
| VANCOUVER INTL A | 1937 | 2024 | 1108447 | 1937-01-01 | 2024-08-01 |
| VANCOUVER INTL A | 2013 | 2024 | 1108395 | 2013-06-13 | 2024-08-01 |
| Station Name | Start Year | End Year | Station ID | Start Date | End Date |
|---|---|---|---|---|---|
| ABBOTSFORD A | 2012 | 2024 | 1100032 | 2012-06-21 | 2024-07-30 |
| ABBOTSFORD A | 1944 | 2024 | 1100030 | 1944-10-01 | 2012-06-20 |
| ABBOTSFORD UPPER SUMAS | 1935 | 1946 | 1100040 | 1935-11-01 | 1946-03-31 |
| Station Name | Start Year | End Year | Station ID | Start Date | End Date |
|---|---|---|---|---|---|
| PRINCE GEORGE A | 1942 | 2009 | 1096450 | 1942-07-01 | 2009-10-21 |
| PRINCE GEORGE | 2009 | 2024 | 1096439 | 2009-10-22 | 2024-07-30 |
| PRINCE GEORGE | 1912 | 1945 | 1096436 | 1912-08-01 | 1945-06-30 |
| Station Name | Start Year | End Year | Station ID | Start Date | End Date |
|---|---|---|---|---|---|
| FORT NELSON A | 1937 | 2012 | 1192940 | 1937-09-01 | 2012-11-14 |
| FORT NELSON A | 2012 | 2024 | 1192946 | 2012-11-08 | 2024-07-30 |
| Station Name | Start Year | End Year | Station ID | Start Date | End Date |
|---|---|---|---|---|---|
| KELOWNA | 1899 | 1962 | 1123930 | 1899-03-01 | 1962-09-30 |
| KELOWNA | 1961 | 1969 | 1123975 | 1961-08-01 | 1969-11-30 |
| KELOWNA A | 1968 | 2005 | 1123970 | 1968-10-01 | 2005-09-30 |
| KELOWNA | 2009 | 2024 | 1123939 | 2009-09-03 | 2024-07-30 |
| KELOWNA UBCO | 2013 | 2024 | 1123996 | 2013-12-16 | 2024-07-30 |
Kamloops: 1939-2024
Penticton: 1941-2024
FortStJohn: 1910-2024
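Checking which station records cover 1941 can be sketched in R as below. The data frame copies a few rows from the tables above; the column names and the choice to require coverage of the whole calendar year 1941 are assumptions made for this illustration.

```r
# A few station records copied from the tables above (column names are hypothetical).
stations <- data.frame(
  station_name = c("VANCOUVER INTL A", "ABBOTSFORD A", "ABBOTSFORD UPPER SUMAS",
                   "PRINCE GEORGE A", "PRINCE GEORGE", "FORT NELSON A", "KELOWNA"),
  station_id = c("1108447", "1100030", "1100040",
                 "1096450", "1096436", "1192940", "1123930"),
  start_date = as.Date(c("1937-01-01", "1944-10-01", "1935-11-01",
                         "1942-07-01", "1912-08-01", "1937-09-01", "1899-03-01")),
  end_date = as.Date(c("2024-08-01", "2012-06-20", "1946-03-31",
                       "2009-10-21", "1945-06-30", "2012-11-14", "1962-09-30")),
  stringsAsFactors = FALSE
)

# Keep records whose date range spans all of 1941 (an assumed criterion).
covers_1941 <- subset(
  stations,
  start_date <= as.Date("1941-01-01") & end_date >= as.Date("1941-12-31")
)

print(covers_1941[, c("station_name", "station_id")])
```

A record that covers 1941 does not necessarily reach 2021, which is why several station IDs per location have to be combined.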
The data is read using the process_and_save_data() function, located in
../climate_extreme_RA/R/read_data_(# corresponding station name).R
Within the process_and_save_data() function,
deal_with_non_exist_date() detects missing dates (i.e.,
dates that should be present based on the expected continuous time
series but are absent). The function addresses these missing rows by
adding rows for these dates and filling in NA for the other
value columns.
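The gap-filling idea described above can be sketched in base R as follows. The function fill_missing_dates() is a hypothetical stand-in written for this illustration; it is not the project's deal_with_non_exist_date(), whose actual implementation may differ. The temperature values are made up.

```r
# Hypothetical stand-in for deal_with_non_exist_date(); not the project's code.
fill_missing_dates <- function(df, date_col = "LOCAL_DATE") {
  # Build the complete daily calendar between the first and last observed date.
  full_dates <- seq(min(df[[date_col]]), max(df[[date_col]]), by = "day")
  skeleton <- data.frame(full_dates)
  names(skeleton) <- date_col

  # Left join onto the calendar: dates absent from df get NA in all value columns.
  out <- merge(skeleton, df, by = date_col, all.x = TRUE)
  out[order(out[[date_col]]), , drop = FALSE]
}

# Toy series with 2021-06-27 and 2021-06-28 missing (illustrative values).
obs <- data.frame(
  LOCAL_DATE = as.Date(c("2021-06-25", "2021-06-26", "2021-06-29")),
  MAX_TEMPERATURE = c(28.5, 30.1, 31.7)
)

filled <- fill_missing_dates(obs)
print(filled)
```

Joining onto a full calendar keeps the observed rows unchanged and makes every gap explicit as an NA row.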
Note: When the date ranges of different station records overlap,
I manually remove the rows with overlapping dates from the older
station data and retain the corresponding rows from the newer
station data. This is done before running process_and_save_data()
to read the data.
For example, given two date ranges:
Station A: October 1, 1944, to June 20, 2012
Station B: November 1, 1935, to March 31, 1946
I keep the data from Station B for the period from November 1, 1935, to September 30, 1944, and the data from Station A for the period from October 1, 1944, to June 20, 2012. The adjusted data is then treated as the original raw data.
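The source describes this overlap removal as a manual step. Purely as an illustration, the same rule could be scripted in R as below; combine_stations() and the STATION_ID column are hypothetical names, and the rows are toy data around the October 1, 1944 boundary.

```r
# Hypothetical helper: drop rows from the older record wherever the newer
# record has the same date, then stack the two and sort by date.
combine_stations <- function(older, newer, date_col = "LOCAL_DATE") {
  keep <- !(older[[date_col]] %in% newer[[date_col]])
  combined <- rbind(older[keep, , drop = FALSE], newer)
  combined[order(combined[[date_col]]), , drop = FALSE]
}

# Toy data: Station B is the older record, Station A the newer one.
station_b <- data.frame(
  LOCAL_DATE = as.Date(c("1944-09-29", "1944-09-30", "1944-10-01", "1944-10-02")),
  STATION_ID = "B",
  stringsAsFactors = FALSE
)
station_a <- data.frame(
  LOCAL_DATE = as.Date(c("1944-10-01", "1944-10-02", "1944-10-03")),
  STATION_ID = "A",
  stringsAsFactors = FALSE
)

merged <- combine_stations(older = station_b, newer = station_a)
print(merged)
```

Each date then appears once, with Station B supplying the dates up to September 30, 1944 and Station A supplying everything from October 1, 1944 onward.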
The raw data path is specified at the beginning of each station’s
analysis R Markdown file, located in
../climate_extreme_RA/reports_station/(# corresponding station name)_heatwave_analysis.Rmd
This dataset provides the
1. Data introduction
2. Web introduction (for how to extract the grid, see OneNote); script introduction
3. Code introduction
(See OneNote regarding time zones)
week 3 4
Dataset ID: “Table: 32-10-0358-01 (formerly CANSIM 001-0014)”
Year Range: 1910-2024
Frequency: Annual
Variables: Average yield, in hundredweight per harvested acre
Geography: Canada, Province-wide: British Columbia (BC); Other provinces
Rationale for choosing potatoes: According to Sonnewald et al. (2015), potatoes are highly vulnerable to high temperatures, which negatively impact tuber development, storage, and seed potato fitness.
To investigate the potential correlation between temperature extremes, including heat waves, and long-term yield patterns in British Columbia, a comprehensive yield dataset is required.
Link to data source: Statistics Canada
If the link does not work, the data can be accessed through the official website of Statistics Canada by following the navigation path outlined below:
Home > Data (in the search bar, search for “Area, production and farm value of potatoes” or Dataset ID: 32-10-0358-01) > Area, production and farm value of potatoes
Step 1: Click the Add/Remove button.
Step 2: The picture below shows the layout of the column filter options.
Step 2.1 Select the Geography:
Click on the Geography tab.
Do not check the box next to “Canada”, as this would include nationwide data.
Expand the list by clicking the “+” symbol next to “Canada” and select the desired province: British Columbia (BC).
Step 2.2 Choose the Variables:
Average yield, potatoes
Step 2.3 Click on the Reference period tab.
Step 2.4 Customize the Layout:
Step 3: Download the dataset.
The data is read within
../climate_extreme_RA/R_agricultural/read_data.R
# File paths for Potato data
file_paths <- c("../data/agri/Potato_Data.csv")
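As a hedged sketch of what reading and filtering the potato file might look like, the example below keeps only the British Columbia rows. The column names REF_DATE, GEO, and VALUE are an assumption about the layout of the downloaded CSV, which depends on the options chosen in the download steps, and the numbers are placeholders rather than real yields.

```r
# Placeholder data standing in for Potato_Data.csv; column names are assumed
# and the yield values are NOT real.
potato_text <- "REF_DATE,GEO,VALUE
2020,Canada,100
2020,British Columbia,200
2021,Canada,110
2021,British Columbia,210"

# For the real file, replace `text = potato_text` with file_paths[1].
potato <- read.csv(text = potato_text, stringsAsFactors = FALSE)

# Keep only the British Columbia rows and give the columns descriptive names.
bc_yield <- potato[potato$GEO == "British Columbia", c("REF_DATE", "VALUE")]
names(bc_yield) <- c("year", "avg_yield_cwt_per_acre")

print(bc_yield)
```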